System and method for least mean fourth adaptive filtering

ABSTRACT

The system and method for least mean fourth adaptive filtering is a system that uses a general purpose computer or a digital circuit (such as an ASIC, a field-programmable gate array, or a digital signal processor) that is programmed to utilize a normalized least mean fourth algorithm. The normalization is performed by dividing a weight vector update term by the fourth power of the norm of the regressor.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to digital signal processing techniques and signal filtering, and particularly to a system and method for least mean fourth adaptive filtering that provides an adaptive filter and method of adaptive filtering utilizing a normalized least mean fourth algorithm.

2. Description of the Related Art

The least mean fourth (LMF) algorithm has uses in a wide variety of applications. The algorithm outperforms the well-known least mean square (LMS) algorithm in cases with non-Gaussian noise. Even in Gaussian environments, the LMF algorithm can outperform the LMS algorithm when initialized far from the Wiener solution. The true usefulness of the LMF algorithm lies in its faster initial convergence and lower steady-state error relative to the LMS algorithm. More importantly, its mean-fourth error cost function yields better performance than that of the LMS for noise of sub-Gaussian nature, or noise with a light-tailed probability density function. However, this higher-order algorithm requires a much smaller step-size to ensure stable adaptation, since the cubed error in the LMF gradient vector can cause devastating initial instability, resulting in unnecessary performance degradation.

One approach to solving the above degradation problem is to normalize the weight update term of the algorithm. In conventional techniques, the LMF algorithm is normalized by dividing the weight vector update term by the squared norm of the regressor. In some prior art techniques, this normalization is performed by dividing the weight vector update term by a weighted sum of the squared norm of the regressor and the squared norm of the estimation error vector. Thus, the LMF algorithm is normalized by both the signal power and the error power. Combining the signal power and the error power has the advantage that the former normalizes the input signal, while the latter can dampen down the outlier estimation errors, thus improving stability while still retaining fast convergence.

The above has been modified by using an adaptive, rather than fixed, mixing parameter in the weighted sum of the squared norm of the regressor and the squared norm of the estimation error vector. The adaptation of the mixing parameter improves the tracking properties of the algorithm. However, unlike the normalization of the LMS algorithm, the above normalization techniques of the LMF algorithm do not protect the algorithm from divergence when the input power of the adaptive filter increases. In fact, as will be shown below, the above prior art normalized LMF (NLMF) algorithms diverge when the input power of the adaptive filter exceeds a threshold that depends on the step-size of the algorithm. The reason for this drawback is that in all of the above techniques, the weight vector update term, which is a fourth order polynomial of the regressor, is normalized by a second order polynomial of the regressor.

The prior art normalization techniques do not ensure a normalization of the input signal. Thus, the algorithm stability remains dependent on the input power of the adaptive filter.

For the particular application of adaptive plant identification, as diagrammatically illustrated in FIG. 1, the output a_(k) of the unknown plant 100 is given by:

a _(k) =g ^(T) x _(k) +b _(k)  (1)

where

g≡(g ₁ ,g ₂ , . . . ,g _(N))^(T)  (2)

is a vector composed of the plant parameters g_(i) (from an unknown finite impulse response (FIR) filter 102), and

x _(k)=(x _(k) ,x _(k−1) ,x _(k−2) , . . . ,x _(k−N+1))^(T)  (3)

is the regressor vector at time k. N is the number of plant parameters, x_(k) is the plant input, b_(k) is the plant noise, and the notation (.)^(T) represents the transpose of (.). The identification of the plant is made by an adaptive finite impulse response (FIR) filter 104 whose length is assumed equal to that of the plant. The weight vector h_(k) of the adaptive filter is adapted on the basis of the error e_(k), which is given by

e _(k) =a _(k) −h _(k) ^(T) x _(k),  (4)

where h_(k)=(h_(1,k), h_(2,k), . . . , h_(N,k))^(T). The adaptation algorithm of interest is the LMF algorithm, which is described by

h _(k+1) =h _(k) +μe _(k) ³ x _(k),  (5)

where μ>0 is the algorithm step-size. The error signal e_(k) can be decomposed into two terms as follows:

e _(k) ≡b _(k)+ε_(k).  (6)

The first term on the right hand side of equation (6), b_(k), is the plant noise. The second term, ε_(k), is the excess estimation error. The weight deviation vector is defined by

v _(k) ≡h _(k) −g.  (7)

From equations (1), (4), (6) and (7),

ε_(k) =−v _(k) ^(T) x _(k).  (8)

Inserting equations (1), (4) and (7) into equation (5) yields

v _(k+1) =v _(k)+μ(b _(k) −v _(k) ^(T) x _(k))³ x _(k).  (9)

In the above, the following assumptions are used: The first assumption (assumption A1) is that the sequences {x_(k)} and {b_(k)} are mutually independent. The second assumption (assumption A2) is that {x_(k)} is a stationary sequence of zero mean random variables with a finite variance σ_(x) ². The third assumption (assumption A3) is that {b_(k)} is a stationary sequence of independent zero mean random variables with a finite variance σ_(b) ². Such assumptions are typical in the context of adaptive filtering.
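By way of a non-limiting illustration, the error of equation (4) and the LMF update of equation (5) may be sketched in Python as follows. The function name `lmf_step` and the example values are hypothetical and chosen only for illustration; the sketch is not a required implementation.

```python
def lmf_step(h, x, a, mu):
    """One LMF iteration; returns (updated weights, error)."""
    # Equation (4): error between the desired output and the filter output
    e = a - sum(hi * xi for hi, xi in zip(h, x))
    # Equation (5): step along the regressor, scaled by the cubed error
    h_next = [hi + mu * e * e * e * xi for hi, xi in zip(h, x)]
    return h_next, e


# Scalar, noise-free example: plant g = 1, adaptive filter starting at h = 0
h, e = lmf_step([0.0], [1.0], a=1.0, mu=0.5)
```

In this scalar example the weight moves from 0 to 0.5, so the weight deviation of equation (7) shrinks from −1 to −0.5.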

In order to emphasize the need for normalization in the least mean fourth algorithm, examining the normalization of the LMS algorithm is important. The stability of the LMS algorithm is dependent upon the input power of the adaptive filter. This makes it very hard, if not impossible, to choose a step-size that guarantees stability of the algorithm when there is lack of knowledge about the input power. This is solved by normalizing the weight update term by ∥x_(k)∥², where ∥x_(k)∥ is the Euclidean norm of the vector x_(k), which is defined as ∥x_(k)∥=√(x_(k) ^(T)x_(k)). The resulting algorithm is referred to as the normalized LMS (NLMS) algorithm. This algorithm is stable for all values of the filter input power so long as the step-size is between 0 and 2.

It is desirable to develop a version of the LMF algorithm that has a similar feature as that of the NLMS algorithm; i.e., stability for all values of the filter input power for an appropriate fixed range of the step-size. One prior art normalization technique is given as

$\begin{matrix} {{v_{k + 1} = {v_{k} + {{\mu \left( {b_{k} - {v_{k}^{T}x_{k}}} \right)}^{3}\frac{x_{k}}{{x_{k}}^{2}}}}},} & (10) \end{matrix}$

and a second version is given by

$\begin{matrix} {{v_{k + 1} = {v_{k} + {{\mu \left( {b_{k} - {v_{k}^{T}x_{k}}} \right)}^{3}\frac{x_{k}}{\delta + {\lambda {x_{k}}^{2}} + {\left( {1 - \lambda} \right){e_{k}}^{2}}}}}},} & (11) \end{matrix}$

where δ is a small positive number, 0<δ<1, and e_(k)=(e_(k), e_(k−1), e_(k−2), . . . , e_(k−N+1))^(T) is the error vector. The parameter λ is referred to as the mixing power parameter. The choice of λ is a compromise between fast convergence and low steady-state error. However, the stability of the above algorithms depends on the mean square input of the adaptive filter. To show this undesired feature for the algorithm of equation (10), we may consider the scalar case, N=1, with zero noise, b_(k)=0, and binary input, x_(k)∈{−1,1}, with μ=0.5. In this case, equation (10) implies that:

v _(k+1) =v _(k)−0.5v _(k) ³.  (12)

If v₁=1, then equation (12) implies that v₂=0.5, v₃=0.4375, v₄=0.3956, v₅=0.3647, etc. Thus, v_(k) is decaying in this case. Repeating this example with x_(k)∈{−4,4}, while keeping all other conditions unchanged, equation (10) implies that:

v _(k+1) =v _(k)−8v _(k) ³.  (13)

Again, if v₁=1, then equation (13) yields v₂=−7, v₃=2737, v₄=−1.6(10¹¹), v₅=3.5(10³⁴), etc. Thus, the algorithm of equation (10) diverges in this case. This shows that the stability of the normalized LMF algorithm of equation (10) depends on the input power of the adaptive filter.
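The two scalar recursions of equations (12) and (13) can be reproduced numerically. The following Python sketch is offered only as an illustrative check; the helper names `iterate` and `eq10_scalar` are hypothetical.

```python
def iterate(update, v1, steps):
    """Return [v_1, ..., v_(steps+1)] for the recursion v_(k+1) = update(v_k)."""
    values = [v1]
    for _ in range(steps):
        values.append(update(values[-1]))
    return values


def eq10_scalar(mu, x):
    # Scalar, noise-free form of equation (10): v - mu * x^2 * v^3
    return lambda v: v - mu * x * x * v ** 3


unit_input = iterate(eq10_scalar(0.5, 1.0), 1.0, 4)   # equation (12)
large_input = iterate(eq10_scalar(0.5, 4.0), 1.0, 4)  # equation (13)
```

With unit input the sequence shrinks toward zero, whereas with an input magnitude of 4 it alternates in sign and grows without bound, even though the step-size is unchanged.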

It can also be shown that the stability of the algorithm of equation (11) also depends on the input power. We may consider again the scalar case with μ=0.5, δ=0, λ=0.5, x_(k)∈{−1,1}, and b_(k)=0. In this case, equations (6) and (8) imply that e_(k) ²=v_(k) ²x_(k) ² and equation (11) implies that:

$\begin{matrix} {v_{k + 1} = {v_{k} - {\frac{v_{k}^{3}}{1 + v_{k}^{2}}.}}} & (14) \end{matrix}$

If v₁=1, then equation (14) produces v₂=0.5, v₃=0.4, v₄=0.3448, v₅=0.3082, etc. Thus, v_(k) is decaying in this case. Repeating this example with x_(k)∈{−4,4}, while keeping all other conditions unchanged, equation (11) implies that:

$\begin{matrix} {v_{k + 1} = {v_{k} - {\frac{16\; v_{k}^{3}}{1 + v_{k}^{2}}.}}} & (15) \end{matrix}$

Again, if v₁=1, then equation (15) produces v₂=−7, v₃=102.76, v₄=−1541.2, v₅=23119, etc. Thus, the algorithm of equation (11) diverges in this case. This shows that the stability of the normalized LMF algorithm of equation (11) depends on the input power of the adaptive filter.
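The scalar recursions of equations (14) and (15) can likewise be checked numerically. The Python sketch below evaluates the scalar, noise-free form of equation (11) directly; the function names `eq11_scalar` and `trajectory` are hypothetical and used only for illustration.

```python
def eq11_scalar(v, x, mu=0.5, delta=0.0, lam=0.5):
    """Scalar, noise-free form of equation (11), where e_k = -v_k * x_k."""
    e = -v * x
    denominator = delta + lam * x * x + (1.0 - lam) * e * e
    return v + mu * e ** 3 * x / denominator


def trajectory(x, v1=1.0, steps=4):
    values = [v1]
    for _ in range(steps):
        values.append(eq11_scalar(values[-1], x))
    return values


unit_input = trajectory(1.0)   # equation (14)
large_input = trajectory(4.0)  # equation (15)
```

As with equation (10), the deviation decays for unit input and grows in magnitude with alternating sign once the input magnitude is raised to 4.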

The above results regarding the dependence of the stability of the prior art NLMF algorithms on the input power of the adaptive filter suggest the need for an NLMF algorithm whose stability does not depend on the input power. Thus, a system and method for least mean fourth adaptive filtering solving the aforementioned problems is desired.

SUMMARY OF THE INVENTION

The system and method for least mean fourth adaptive filtering is a system that uses a general purpose computer or a digital circuit (such as an ASIC, a field-programmable gate array, or a digital signal processor) that is programmed to utilize a normalized least mean fourth algorithm. The normalization is performed by dividing a weight vector update term by the fourth power of the norm of the regressor. The normalized least mean fourth algorithm remains stable as the filter input power increases.

The least mean fourth adaptive filter includes a finite impulse response filter having a desired output a_(k) at a time k. An impulse response of the finite impulse response filter is defined by a set of weighting filter coefficients h_(k), and an input signal of the finite impulse response filter is defined by a regressor input signal vector, x_(k), at the time k. An error signal e_(k) at the time k is calculated as a difference between the desired output a_(k) at the time k and an estimated signal given by h_(k) ^(T)x_(k), such that e_(k)=a_(k)−h_(k) ^(T)x_(k). The set of weighting filter coefficients are iteratively updated as

${h_{k + 1} = {h_{k} + {\alpha \frac{e_{k}^{3}x_{k}}{{x_{k}}^{4}}}}},$

where α is a fixed positive number step-size.

These and other features of the present invention will become readily apparent upon further review of the following specification and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of a plant to be identified by digital signal processing techniques.

FIG. 2 is a graph illustrating simulation results of a system and method for least mean fourth adaptive filtering according to the present invention, validating step-size stability bounds for Gaussian input and noise.

FIG. 3 is a graph illustrating simulation results of the system and method for least mean fourth adaptive filtering according to the present invention, validating step-size stability bounds for uniformly distributed input and noise.

FIG. 4 is a graph illustrating simulation results of the system and method for least mean fourth adaptive filtering according to the present invention, validating step-size stability bounds for correlated input.

FIG. 5 is a graph comparing initial convergence of the present method for normalized least mean fourth (NLMF) adaptive filtering against that of conventional least mean fourth (LMF) adaptive filtering, least mean square (LMS) adaptive filtering and normalized least mean square (NLMS) adaptive filtering.

FIG. 6 is a block diagram illustrating a telecommunication system that includes echo cancellation and employs the system and method for least mean fourth adaptive filtering according to the present invention.

FIG. 7 is a block diagram illustrating an echo canceller used in the system of FIG. 6.

Similar reference characters denote corresponding features consistently throughout the attached drawings.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The system and method for least mean fourth adaptive filtering is a system that uses a general purpose computer or a digital circuit (such as an ASIC, a field-programmable gate array, or a digital signal processor) that is programmed to utilize a normalized least mean fourth algorithm. The normalization is performed by dividing a weight vector update term by the fourth power of the norm of the regressor. The normalized least mean fourth algorithm remains stable as the filter input power increases.

In order to examine the present least mean fourth adaptive filter, it is useful to first consider the scalar case, in which N=1 (where N is the number of plant parameters) with zero noise. In such a case, equation (9) simplifies to:

v _(k+1)=(1−μx _(k) ⁴ v _(k) ²)v _(k).  (16)

The stability of the algorithm represented by equation (16) depends on the statistics of x_(k) (i.e., the plant input). This dependence is removed by using a variable step-size μ that satisfies the following:

$\begin{matrix} {{\mu = \frac{\alpha}{x_{k}^{4}}},} & (17) \end{matrix}$

assuming that x_(k)≠0 for all k and that α is a fixed positive number. In such a case, equation (16) becomes

v _(k+1)=(1−αv _(k) ²)v _(k).  (18)

Since α does not depend on x_(k), the sequence {v_(k)} generated by equation (18) also does not depend on x_(k). Thus, the stability of equation (18) does not depend on the statistics of x_(k). A sufficient condition for the decay of |v_(k)| for all k is that:

$\begin{matrix} {{0 < \alpha < \frac{2}{v_{k}^{2}}},{{for}\mspace{14mu} {all}\mspace{14mu} {k.}}} & (19) \end{matrix}$

When |v_(k)| is decreasing, |v₁| will be the largest value of |v_(k)|. Then, a sufficient condition for equation (19) is:

$\begin{matrix} {0 < \alpha < {\frac{2}{v_{1}^{2}}.}} & (20) \end{matrix}$

Equation (20) is the step-size range that is sufficient for the convergence of the algorithm represented by equation (16).
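As an illustrative check of equations (18) and (20), the scalar recursion can be iterated for one step-size inside the range of equation (20) and one outside it. The initial deviation of 3 and the factors 0.9 and 1.5 are arbitrary choices made for this sketch, and the name `scalar_nlmf` is hypothetical.

```python
def scalar_nlmf(v1, alpha, steps):
    """Iterate equation (18): v_(k+1) = (1 - alpha * v_k^2) * v_k."""
    values = [v1]
    for _ in range(steps):
        v = values[-1]
        values.append((1.0 - alpha * v * v) * v)
    return values


v1 = 3.0
bound = 2.0 / v1 ** 2                        # upper limit of equation (20)

inside = scalar_nlmf(v1, 0.9 * bound, 50)    # step-size within the range
outside = scalar_nlmf(v1, 1.5 * bound, 5)    # step-size beyond the range

# True when |v_k| shrinks at every iteration
decaying = all(abs(b) < abs(a) for a, b in zip(inside, inside[1:]))
```

Note that the input x_(k) does not appear anywhere in the recursion, which is the point of the variable step-size of equation (17).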

The above can now be extended to the N-dimensional LMF algorithm described by equation (9). A factor that determines the convergence of the algorithm is its behavior at the start of adaptation. At the start of adaptation, the magnitude of the deviation vector v_(k) is usually so large that the excess estimation error v_(k) ^(T)x_(k) dominates the plant noise b_(k). In such a case, equation (9) can be approximated as:

v _(k+1) ≈v _(k)−μ(v _(k) ^(T) x _(k))³ x _(k) =T _(k) v _(k),  (21)

where the transformation matrix T_(k) is given by:

T _(k) =I−μ x _(k) x _(k) ^(T) v _(k) v _(k) ^(T) x _(k) x _(k) ^(T).  (22)

The transformation matrix of equation (22) depends on ∥x_(k)∥. This dependence can be removed by using a variable step-size μ that satisfies:

$\begin{matrix} {{\mu = \frac{\alpha}{{x_{k}}^{4}}},} & (23) \end{matrix}$

assuming that x_(k)≠0 for all k and that α is a fixed positive number. In such a case, equation (22) becomes:

$\begin{matrix} {T_{k} = {I - {\alpha {\frac{x_{k}x_{k}^{T}v_{k}v_{k}^{T}x_{k}x_{k}^{T}}{{x_{k}}^{4}}.}}}} & (24) \end{matrix}$

The matrix T_(k) given by equation (24) depends on the direction of the vector x_(k), but it does not depend on its norm, ∥x_(k)∥. Inserting equation (23) into equation (5) yields:

$\begin{matrix} {h_{k + 1} = {h_{k} + {\alpha {\frac{e_{k}^{3}x_{k}}{{x_{k}}^{4}}.}}}} & (25) \end{matrix}$

Equation (25) represents the present NLMF algorithm. Due to equation (4), the weight vector update term on the right-hand side of equation (25) is proportional to the negative gradient of (e_(k)/∥x_(k)∥)⁴ with respect to h_(k). Thus, the algorithm represented by equation (25) may be viewed as a gradient algorithm based on the minimization of the mean fourth normalized estimation error E(e_(k)/∥x_(k)∥)⁴. Similar to the NLMS algorithm, the NLMF algorithm represented by equation (25) may be regularized by adding a small positive number to ∥x_(k)∥⁴ in order to avoid division by zero.
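By way of a non-limiting illustration, one iteration of equation (25), including the optional regularization mentioned above, may be sketched in Python as follows. The function name `nlmf_step`, the parameter name `eps`, and its default value of 1e-12 are assumptions made for this sketch only.

```python
def nlmf_step(h, x, a, alpha, eps=1e-12):
    """One NLMF iteration per equation (25); returns (updated weights, error)."""
    # Equation (4): estimation error
    e = a - sum(hi * xi for hi, xi in zip(h, x))
    # Squared Euclidean norm of the regressor, ||x_k||^2
    norm_sq = sum(xi * xi for xi in x)
    # eps regularizes ||x_k||^4 to avoid division by zero (assumed value)
    scale = alpha * e * e * e / (norm_sq * norm_sq + eps)
    return [hi + scale * xi for hi, xi in zip(h, x)], e


# Noise-free check: scaling the input by 100 leaves the weight update unchanged
g = [1.0, -2.0]                                   # hypothetical plant
x = [0.6, 0.8]
x_big = [100.0 * xi for xi in x]
a = sum(gi * xi for gi, xi in zip(g, x))
a_big = sum(gi * xi for gi, xi in zip(g, x_big))

h_small, _ = nlmf_step([0.0, 0.0], x, a, alpha=0.1)
h_large, _ = nlmf_step([0.0, 0.0], x_big, a_big, alpha=0.1)
```

In the noise-free case both calls produce essentially the same weights, reflecting that the update depends on the direction of x_(k) but not on its norm.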

In the following, a condition on the step-size α is derived that is sufficient for the convergence of the algorithm of equation (25). First, a step-size range is derived that is sufficient for stability of the algorithm in the initial phase of adaptation. Then, a step-size range that is sufficient for steady-state stability is derived. The final step-size range is obtained from the intersection of these two ranges.

One is typically interested in mean-square stability that is based on the evolution of the mean-square deviation E(∥v_(k)∥²) with time, where E denotes the mathematical expectation operator. In the initial phase, the magnitude of E(v_(k)) is usually large in comparison with the fluctuation of v_(k) around E(v_(k)). Thus, E(∥v_(k)∥²)≈∥E(v_(k))∥² in the initial phase of adaptation. Consequently, the derivation of the stability step-size range in the initial adaptation phase can be simplified by calculating it on the basis of the mean of v_(k), rather than the mean square of v_(k). This is not the case with the steady-state condition, which should be derived on the basis of the mean square of v_(k).

With respect to the initial phase condition, at the start of adaptation, the magnitude of the deviation vector v_(k) is usually so large that the excess estimation error v_(k) ^(T)x_(k) dominates the plant noise b_(k). In such a case, equation (25) can be approximated by equation (21), with T_(k) being given by equation (24). Letting E(x|y) denote the conditional expectation of x given y, a condition on α is derived that implies that:

∥E(v _(k+1) |v _(k))∥<∥v _(k)∥.  (26)

This condition means that the magnitude of the weight deviation vector is decreasing at each iteration step, in a mean sense. Due to equation (21), E(v_(k+1)|v_(k))=E(T_(k)|v_(k))v_(k). Then, a sufficient condition for equation (26) is that the magnitudes of all the eigenvalues of the matrix E(T_(k)|v_(k)) are less than 1. Due to equation (24), the eigenvalues of E(T_(k)|v_(k)) are equal to 1−αλ_(i)(k), where λ_(i)(k); i=1, 2, . . . , N are the eigenvalues of the matrix P(k), defined by:

$\begin{matrix} {{P(k)} \equiv {{E\left( \frac{x_{k}x_{k}^{T}v_{k}v_{k}^{T}x_{k}x_{k}^{T}}{{x_{k}}^{4}} \middle| v_{k} \right)}.}} & (27) \end{matrix}$

Thus, a sufficient condition for equation (26) is that:

|1−αλ_(i)(k)|<1,i=1,2, . . . ,N  (28)

It is assumed that the matrix P(k) is positive definite. This assumption is a sort of persistent excitation assumption. A geometrical validation of this assumption is given below. From equation (27), the trace of the matrix P(k) is less than or equal to ∥v_(k)∥². Thus,

0<λ_(i)(k)<∥v _(k)∥² ,i=1,2, . . . ,N.  (29)

Equation (29) implies that a sufficient condition for equation (28) is that

$\begin{matrix} {0 < \alpha < {\frac{2}{{v_{k}}^{2}}.}} & (30) \end{matrix}$

Physically speaking, in order to satisfy equation (30) for all k, it is sufficient to satisfy it at k=1, since the maximum weight deviation takes place at the start of adaptation. This leads to the following condition:

$\begin{matrix} {0 < \alpha < {\frac{2}{{v_{1}}^{2}}.}} & (31) \end{matrix}$

Equation (31) is the step-size range that is sufficient for stability of the NLMF algorithm of equation (25) in the initial phase of adaptation. This step-size range does not depend on the input power of the adaptive filter. The range given by equation (31) reflects the dependence of the stability of the LMF algorithm on the weight initialization of the adaptive filter.

The following provides a geometrical validation of the above assumption that the matrix P(k) is positive definite. For any vector u in the N-dimensional space, equation (27) implies that:

$\begin{matrix} {{{u^{T}{P(k)}u} = {E\left\lfloor {{u_{x}^{2}(k)}{v_{x}^{2}(k)}} \middle| v_{k} \right\rfloor}},{where}} & (32) \\ {{u_{x}(k)} \equiv \frac{u^{T}x_{k}}{x_{k}}} & (33) \end{matrix}$

is the projection of the vector u on the vector x_(k), and

$\begin{matrix} {{v_{x}(k)} \equiv \frac{v_{k}^{T}x_{k}}{x_{k}}} & (34) \end{matrix}$

is the projection of the vector v_(k) on the vector x_(k). When x_(k) is persistently exciting, it spans the whole N-dimensional space. Then, with non-zero probability, the projections of given vectors u and v_(k) on x_(k) will be different from zero, which implies that u_(x) ²(k) v_(x) ²(k) will be positive. This implies that the right-hand side of equation (32) will be positive, which implies that the matrix P(k) is positive definite.
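The quantities in equations (27) and (32) can be estimated by Monte Carlo averaging. The Python sketch below assumes a white Gaussian regressor of dimension 4 and arbitrary fixed vectors u and v, all chosen only for illustration; the function name `estimate_forms` is hypothetical. It relies on the fact that the trace of the matrix inside the expectation of equation (27) equals v_(x) ²(k).

```python
import random


def estimate_forms(u, v, trials=20000, seed=0):
    """Estimate u^T P(k) u (equation (32)) and trace P(k) for a fixed v."""
    rng = random.Random(seed)
    quad = trace = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in v]      # assumed white Gaussian regressor
        norm_sq = sum(xi * xi for xi in x)
        ux_sq = sum(ui * xi for ui, xi in zip(u, x)) ** 2 / norm_sq  # u_x^2(k)
        vx_sq = sum(vi * xi for vi, xi in zip(v, x)) ** 2 / norm_sq  # v_x^2(k)
        quad += ux_sq * vx_sq
        trace += vx_sq
    return quad / trials, trace / trials


v = [0.5, -0.5, 0.5, -0.5]        # hypothetical deviation vector, ||v||^2 = 1
u = [1.0, 0.0, 0.0, 0.0]          # hypothetical test direction
quad_form, trace_p = estimate_forms(u, v)
v_norm_sq = sum(vi * vi for vi in v)
```

Under these assumptions the estimated quadratic form is strictly positive and the estimated trace does not exceed ∥v_(k)∥², in line with equations (29) and (32).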

With regard to the steady-state condition, a sufficient condition of the steady-state stability of the mean square deviation of the regular LMF algorithm of equation (5) is given by:

$\begin{matrix} {0 < \mu < {\frac{E\left( b_{k}^{2} \right)}{10\; {E\left( b_{k}^{4} \right)}{E\left( {x_{k}}^{2} \right)}}.}} & (35) \end{matrix}$

For long adaptive filters, ∥x_(k)∥⁴ can be approximated by E(∥x_(k)∥⁴). Then, equation (23) implies that the condition of equation (35) on μ can be mapped to the following condition on α:

$\begin{matrix} {0 < \alpha < {\frac{{E\left( b_{k}^{2} \right)}{E\left( {x_{k}}^{4} \right)}}{10\; {E\left( b_{k}^{4} \right)}{E\left( {x_{k}}^{2} \right)}}.}} & (36) \end{matrix}$

Equation (36) is the step-size range that is sufficient for stability of the present NLMF algorithm of equation (25) in the steady-state.

With regard to the final step-size condition, equation (31) is sufficient for convergence of the NLMF algorithm of equation (25) in the initial phase of adaptation, while equation (36) is sufficient for stability around the Wiener solution. Both properties can be achieved by using the step-size to satisfy the following condition:

$\begin{matrix} {{0 < \alpha < \alpha_{o}},{where}} & (37) \\ {\alpha_{o} = {\min {\left\{ {\frac{2}{{v_{1}}^{2}},\frac{{E\left( b_{k}^{2} \right)}{E\left( {x_{k}}^{4} \right)}}{10\; {E\left( b_{k}^{4} \right)}{E\left( {x_{k}}^{2} \right)}}} \right\}.}}} & (38) \end{matrix}$

For a long adaptive filter, E(∥x_(k)∥⁴)/E(∥x_(k)∥²)≈Nσ_(x) ². Then, the second term in the argument of the min {.} function on the right-hand side of equation (38) is increasing in N and the signal-to-noise ratio σ_(x) ²/σ_(b) ². Thus, for the long adaptive filter, non-small signal-to-noise ratio, and non-small initial weight deviation ∥v₁∥,

$\begin{matrix} {\frac{2}{{v_{1}}^{2}} \leq {\frac{{E\left( b_{k}^{2} \right)}{E\left( {x_{k}}^{4} \right)}}{10\; {E\left( b_{k}^{4} \right)}{E\left( {x_{k}}^{2} \right)}}.}} & (39) \end{matrix}$

For illustration of equation (39), an example with N=32, binary input x_(k)ε{−σ_(x), σ_(x)}, binary noise b_(k)ε{−σ_(b),σ_(b)}, and where ∥v₁∥=1 is considered. In this example, equation (39) holds as long as the signal-to-noise ratio σ_(x) ²/σ_(b) ² is greater than ⅝. Equations (38) and (39) imply that:

$\begin{matrix} {\alpha_{o} = {\frac{2}{{v_{1}}^{2}}.}} & (40) \end{matrix}$

Thus, the final step-size condition of equation (37) is identical with the initial phase condition of equation (31). Thus, the condition of equation (31) is sufficient for the stability of the NLMF algorithm of equation (25) in applications with a long adaptive filter and non-small signal-to-noise ratio. This condition is validated by the simulation results given below.
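The bound of equation (38) and the binary example given for equation (39) can be evaluated directly. In the Python sketch below, the function and argument names are hypothetical; the binary case uses the facts that b_(k) ²=σ_(b) ² and ∥x_(k)∥²=Nσ_(x) ² hold exactly for binary sequences.

```python
def alpha_o(v1_norm_sq, e_b2, e_b4, e_xnorm2, e_xnorm4):
    """Equation (38): the smaller of the initial-phase and steady-state bounds."""
    initial_phase = 2.0 / v1_norm_sq                        # equation (31)
    steady_state = e_b2 * e_xnorm4 / (10.0 * e_b4 * e_xnorm2)  # equation (36)
    return min(initial_phase, steady_state)


def binary_example(snr, n=32, sigma_b=1.0, v1_norm_sq=1.0):
    """Binary input and noise, where the moments are known in closed form."""
    sigma_x_sq = snr * sigma_b ** 2
    return alpha_o(
        v1_norm_sq,
        e_b2=sigma_b ** 2,
        e_b4=sigma_b ** 4,
        e_xnorm2=n * sigma_x_sq,
        e_xnorm4=(n * sigma_x_sq) ** 2,
    )


high_snr = binary_example(1.0)   # initial-phase bound is the active one
low_snr = binary_example(0.5)    # steady-state bound becomes the active one
```

For a signal-to-noise ratio of 1 the result equals 2/∥v₁∥², as in equation (40), while for a ratio of 0.5, which is below ⅝, the steady-state term is the smaller of the two.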

The following simulations were performed for the case of adaptive plant identification, as illustrated in FIG. 1. The plant 100 is made of a time-invariant FIR filter 102 with N=32 parameters. The regressor vector is given by:

x _(k)=(x _(k) ,x _(k−1) , . . . ,x _(k−N+1))^(T),  (41)

where x_(k) is the plant input. x_(k) is a zero mean, independent and identically distributed (IID) Gaussian sequence with variance σ_(x) ². The plant noise is a zero mean IID Gaussian sequence with variance σ_(b) ². The plant parameters are given by:

$\begin{matrix} {g_{i} = \left\{ \begin{matrix} {{\rho \; i},} & {1 \leq i \leq {N/2}} \\ {{\rho \left( {N + 1 - i} \right)},} & {{N/2} < i \leq {N.}} \end{matrix} \right.} & (42) \end{matrix}$

The value of ρ is chosen such that ∥g∥=1. In the exemplary case of the simulation, N=32 and ρ=0.0183. The initial weight vector of the adaptive filter is h₁=0, thus ∥v₁∥=1.
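The plant parameters of equation (42) and the corresponding value of ρ can be generated as in the following Python sketch, in which the function name `plant_parameters` is hypothetical.

```python
import math


def plant_parameters(n):
    """Equation (42): triangular impulse response scaled so that ||g|| = 1."""
    # Integer shape i for the first half and N + 1 - i for the second half
    shape = [i if i <= n // 2 else n + 1 - i for i in range(1, n + 1)]
    rho = 1.0 / math.sqrt(sum(s * s for s in shape))
    return rho, [rho * s for s in shape]


rho, g = plant_parameters(32)
g_norm_sq = sum(gi * gi for gi in g)
```

For N=32 this yields ρ≈0.0183, in agreement with the value stated above.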

To study the dependence of the stability of the conventional NLMF algorithms of equations (10) and (11) and the present NLMF algorithm of equation (25) on the input power of the filter, the algorithms are simulated over a wide range of σ_(x), ranging from 0.2 to 1000. The considered value of σ_(b) is 0.01. For the algorithm represented by equation (10), μ=1. For the algorithm represented by equation (11), μ=1, δ=0, and λ=0.5. For the algorithm represented by equation (25), α=1. The simulations have shown that the conventional NLMF algorithms of equations (10) and (11) diverge for σ_(x)=1 and above, whereas the present NLMF algorithm of equation (25) is non-divergent for all σ_(x).
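An experiment of this kind may be sketched in Python as follows. This is a simplified illustration rather than the simulation actually performed: it covers only equations (10) and (25), uses a single run of 2,000 iterations with a fixed random seed, starts from a pre-filled regressor, and declares divergence when ∥v_(k)∥ exceeds 10⁶. All of these choices, and all function names, are assumptions made for the sketch.

```python
import math
import random


def unit_norm_plant(n):
    # Triangular plant of equation (42), scaled so that ||g|| = 1
    shape = [i if i <= n // 2 else n + 1 - i for i in range(1, n + 1)]
    norm = math.sqrt(sum(s * s for s in shape))
    return [s / norm for s in shape]


def final_deviation(gain, sigma_x, sigma_b=0.01, n=32, iters=2000, seed=1):
    """Run plant identification; return ||h - g||, or inf on divergence."""
    rng = random.Random(seed)
    g = unit_norm_plant(n)
    h = [0.0] * n                                     # h_1 = 0, so ||v_1|| = 1
    x = [rng.gauss(0.0, sigma_x) for _ in range(n)]   # pre-filled regressor (assumption)
    dev = 1.0
    for _ in range(iters):
        x = [rng.gauss(0.0, sigma_x)] + x[:-1]        # shift in a new input sample
        a = sum(gi * xi for gi, xi in zip(g, x)) + rng.gauss(0.0, sigma_b)
        e = a - sum(hi * xi for hi, xi in zip(h, x))  # equation (4)
        power = sum(xi * xi for xi in x)              # ||x_k||^2
        step = gain(e, power)
        h = [hi + step * xi for hi, xi in zip(h, x)]
        dev = math.sqrt(sum((hi - gi) * (hi - gi) for hi, gi in zip(h, g)))
        if not dev < 1e6:                             # also catches nan
            return math.inf
    return dev


def eq10_gain(e, power, mu=1.0):
    return mu * e * e * e / power                 # normalized by ||x_k||^2


def eq25_gain(e, power, alpha=1.0):
    return alpha * e * e * e / (power * power)    # normalized by ||x_k||^4


prior_art = final_deviation(eq10_gain, sigma_x=100.0)
present = {s: final_deviation(eq25_gain, sigma_x=s)
           for s in (0.2, 1.0, 100.0, 1000.0)}
```

In this sketch the update of equation (10) is flagged as divergent at σ_(x)=100, while the update of equation (25) ends with a weight deviation below its initial value of 1 at every tested σ_(x).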

In order to validate the step-size condition given by equation (31) of the present NLMF algorithm, the maximum step-size for which the algorithm is convergent is determined by simulations and compared with the step-size bound provided by equation (31). This is performed at several values of σ_(x). The considered noise variance, plant parameter vector, and initial weight vector of the adaptive filter are as given above. The results are shown in FIG. 2. The stability step-size bound obtained by simulation is greater than that provided by the condition of equation (31). This is a validation of equation (31). It should be noted that the stability condition of equation (31) is sufficient, but not necessary. This means that the algorithm is not only convergent for all values of α satisfying equation (31), but it may also converge for some values of α that do not satisfy equation (31). Therefore, all that is needed from the simulations is to show that the range defined by equation (31) is included in the range obtained by simulations, which is the case with the results of FIG. 2. It is not required to show that the step-size range defined by equation (31) coincides with the range obtained by simulations.

To validate the step-size condition of equation (31) for non-Gaussian plant input and plant noise, the simulations of FIG. 2 are re-performed for uniformly distributed input and noise. The considered noise variance, plant parameter vector, and initial weight vector of the adaptive filter are the same as those considered in FIG. 2. The results are shown in FIG. 3. These results validate the sufficiency of the step-size condition given by equation (31) for the stability of the algorithm over a wide range of the input power of the adaptive filter.

To validate the step-size condition given by equation (31) for non-white input of the plant, the simulations of FIG. 2 are re-performed for an autoregressive x_(k) satisfying

x _(k) =βx _(k−1)+σ_(x)√(1−β²)w _(k), 0≤β<1,  (43)

where w_(k) is a zero mean, unity variance IID Gaussian sequence. The parameter β controls the degree of correlation of the sequence {x_(k)}; the greater β is, the stronger the correlation. In the simulations, the considered value of β was 0.95, which corresponds to a strong correlation of the sequence {x_(k)}. This implies both a strong correlation among the components of the regressor and a strong correlation between successive regressors. The noise is white Gaussian. The considered noise variance, plant parameter vector, and initial weight vector of the adaptive filter are the same as those considered in FIG. 2. The results are shown in FIG. 4. These results validate the sufficiency of the step-size condition given by equation (31) for the stability of the algorithm over a wide range of the input power of the adaptive filter.
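A correlated input of the form of equation (43) may be generated as in the following Python sketch. The function name `ar1_input`, the value σ_(x)=2, the sample count, and the choice to draw the starting value from the stationary distribution are assumptions made for illustration.

```python
import math
import random


def ar1_input(n_samples, sigma_x, beta, seed=0):
    """Generate x_k = beta * x_(k-1) + sigma_x * sqrt(1 - beta^2) * w_k."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, sigma_x)       # start in the stationary distribution (assumption)
    scale = sigma_x * math.sqrt(1.0 - beta * beta)
    samples = []
    for _ in range(n_samples):
        x = beta * x + scale * rng.gauss(0.0, 1.0)   # w_k: zero mean, unit variance
        samples.append(x)
    return samples


samples = ar1_input(200_000, sigma_x=2.0, beta=0.95)

# Sample power and lag-1 correlation coefficient of the zero mean sequence
power = sum(s * s for s in samples) / len(samples)
lag1 = sum(p * q for p, q in zip(samples, samples[1:])) / sum(s * s for s in samples)
```

The factor √(1−β²) keeps the variance of x_(k) at σ_(x) ² regardless of β, so that β changes only the correlation and not the input power.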

Finally, FIG. 5 compares the initial convergence of the present NLMF algorithm with those of the LMF, NLMS, and LMS algorithms for white Gaussian x_(k) and b_(k) with σ_(x)=1, and σ_(b)=0.1. FIG. 5 shows the evolution of the excess mean square error (MSE) of the algorithms with time. The instantaneous excess MSE is defined as E(ε_(k) ²), where ε_(k) is defined by equation (8). The step-sizes of the algorithms are chosen such that they have the same steady-state excess MSE. The step-size of the present NLMF algorithm is 1. The resulting steady-state excess MSE is 1×10⁻⁵. The step-sizes of the LMF, NLMS, and LMS algorithms having the same steady-state excess MSE are 0.001, 0.002, and 6.25×10⁻⁵, respectively. The instantaneous excess MSE is evaluated by averaging ε_(k) ² over 1,000 independent runs.

FIG. 5 shows that the initial convergence of the present NLMF algorithm is almost the same as that of the LMF algorithm for the same steady-state excess MSE. This means that the present NLMF algorithm retains the fast initial convergence advantage of the LMF algorithm. With regard to the similarity of the transient performances of the LMF and NLMF algorithms, for large N, ∥x_(k)∥²≈Nσ_(x) ². This implies that the behavior of the NLMF algorithm of equation (25) will be close to that of the LMF algorithm of equation (5) with μ=α/(N²σ_(x) ⁴). This condition is satisfied by the values of μ and α considered in FIG. 5. FIG. 5 also shows that the initial convergence of the NLMS algorithm is almost the same as that of the LMS algorithm and that they are significantly slower than the LMF and NLMF algorithms.
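As a worked check of the relation μ=α/(N²σ_(x) ⁴), substituting the FIG. 5 values α=1, N=32, and σ_(x)=1 gives:

$\mu = \frac{\alpha}{N^{2}\sigma_{x}^{4}} = \frac{1}{32^{2} \cdot 1^{4}} = \frac{1}{1024} \approx 9.8 \times 10^{- 4},$

which is close to the LMF step-size of 0.001 used in FIG. 5.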

FIGS. 6 and 7 illustrate a particular application of the present system and method for least mean fourth adaptive filtering. Echoes are generated whenever part of a speech is reflected back to the source by the floor, walls, or other neighboring objects. An echo is noticeable (or audible) only if the time delay between it and the speech exceeds a few tens of milliseconds. As the result of impedance mismatches in telephone circuits, echoes are also generated. The echoes arise in various situations in telecommunications networks and impair communication quality. Long-delay echoes are often irritating to the user, whereas shorter ones, called “sidetones”, are actually desirable and are intentionally inserted in telecommunications networks to make the telephone circuit seem “alive”.

Echoes with long delay are observed only on long-distance connections. To clearly understand the echo phenomenon, FIG. 6 illustrates a typical long distance telephone connection 200. Central to such a connection, a pair of two-wire segments 202, 204 are provided, the ends of which connect a customer C to a central office O, along with a four-wire carrier section 206 (which might include satellite links). The two-wire circuits 202, 204 are bidirectional, whereas the four-wire circuits 206 are made of two distinct channels, one for each direction.

In order to counteract the echo phenomenon, schemes must be developed to either completely eliminate it (i.e., the ideal requirement), or to at least substantially reduce its adverse effect so as to achieve a transmission of good quality. Echo cancellation is a suitable area for the application of adaptive filtering. An adaptive echo canceller 208, 210 estimates the responses of an underlying echo-generating system in real time in the face of unknown and time-varying echo path characteristics, generates a synthesized echo based on the estimate, and cancels the echo by subtracting the synthesized echo from the received signal.

In FIG. 6, echo cancellers 208, 210 are identical, and FIG. 7 illustrates a block diagram of the echo canceller 208. The synthetic echo y_(k)′ is generated by passing the reference signal through an adaptive filter that ideally matches the impulse response of the echo path. Thus, the transversal filter generates an estimate y_(k)′ of the echo, given by:

$\begin{matrix} {y_{k}^{\prime} = \sum\limits_{i = 0}^{N - 1} h_{i,k}\, x_{k - i},} & (44) \end{matrix}$

where h_(i,k) is the i^(th) sample of the estimated echo-path impulse response at time k, x_(k−i) is the input sample at the i^(th) tap delay, and N is the number of tap coefficients. While passing through the hybrid 212, the speech from customer C results in the echo signal y_(k). This echo, together with the speech from office O, r_(k), constitutes the desired response for the adaptive filter. The canceller error signal is obtained as follows:

ζ_(k)=y_(k)−y_(k)′+r_(k)=e_(k)+r_(k).  (45)

The error signal e_(k) of FIG. 7 is used to control the adjustments to the adaptive filter coefficients according to the present normalized least mean fourth adaptive algorithm in order to continuously improve the echo estimate y_(k)′.
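The canceller loop of equations (44) and (45) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of FIG. 7: the function name `nlmf_echo_canceller`, the echo path `g`, and all numeric values are hypothetical. The small constant `eps` is an added guard against division by zero that does not appear in the update described here, and the example runs with no near-end speech (r_(k)=0), so that ζ_(k)=e_(k) and the observable residual can drive the update directly.

```python
import numpy as np

def nlmf_echo_canceller(x, d, n_taps, alpha, eps=1e-8):
    """Transversal echo canceller with a normalized least mean fourth update.

    x: reference (far-end) samples; d: received signal (echo plus near-end speech).
    Returns the residual signal zeta and the final tap estimates h.
    """
    h = np.zeros(n_taps)
    zeta = np.zeros(len(x))
    buf = np.zeros(n_taps)               # regressor [x_k, x_{k-1}, ..., x_{k-N+1}]
    for k in range(len(x)):
        buf = np.roll(buf, 1)            # shift the tapped delay line by one sample
        buf[0] = x[k]
        y_hat = h @ buf                  # synthetic echo, equation (44)
        zeta[k] = d[k] - y_hat           # canceller error, equation (45)
        norm_sq = buf @ buf + eps        # eps: added guard, not part of the source
        # Cubed error times regressor, divided by the fourth power of the norm.
        h = h + alpha * zeta[k] ** 3 * buf / norm_sq ** 2
    return zeta, h

# Hypothetical noise-free scenario: r_k = 0, so zeta_k = e_k.
rng = np.random.default_rng(0)
g = np.array([0.5, -0.3, 0.2, 0.1])      # assumed echo-path impulse response
x = rng.normal(size=5000)                # assumed white reference signal
y = np.convolve(x, g)[:len(x)]           # echo y_k = sum_i g_i x_{k-i}
zeta, h = nlmf_echo_canceller(x, y, n_taps=len(g), alpha=1.0)
```

In this noise-free setting, the residual power near the end of the run is expected to be lower than at the start, and `h` is expected to move toward `g`. When near-end speech is present, ζ_(k) also contains r_(k), and a practical canceller would need additional logic that is outside the scope of this sketch.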

Ideally, the system eventually converges to the condition ζ_(k)=r_(k). The effect of this ideal condition on the echo cancellation is naturally of some concern. Convergence of the echo to zero, however, is not an adequate criterion of performance for a system of this type, since this is possible only if y_(k) is exactly representable as the output of a fixed-tap filter. A better performance criterion is the convergence of the filter's impulse response to the response of the echo path.
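One way to quantify the convergence of the filter's impulse response to that of the echo path is the normalized misalignment, a measure commonly used in the echo-cancellation literature but not named in the description above. The sketch below is illustrative only. The function name `misalignment_db` and the example vectors are hypothetical, and the true echo path is assumed to be known, which is the case only in simulation.

```python
import numpy as np

def misalignment_db(h, g):
    """Normalized misalignment 10*log10(||h - g||^2 / ||g||^2), in dB.

    h: estimated impulse response; g: true echo-path response (same length).
    """
    h = np.asarray(h, dtype=float)
    g = np.asarray(g, dtype=float)
    return 10.0 * np.log10(np.sum((h - g) ** 2) / np.sum(g ** 2))

# Hypothetical example: an estimate that is 10% short of the true response.
g_true = np.array([0.5, -0.3, 0.2, 0.1])
h_est = 0.9 * g_true
value = misalignment_db(h_est, g_true)   # approximately -20 dB
```

Lower (more negative) values indicate that the estimated taps are closer to the echo path, independently of the near-end speech r_(k).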

It is to be understood that the present invention is not limited to the embodiments described above, but encompasses any and all embodiments within the scope of the following claims. 

We claim:
 1. A method for least mean fourth adaptive filtering, comprising the steps of: (a) providing a finite impulse response filter having a desired output a_(k) at a time k, the finite impulse response filter having an impulse response defined by a set of weighting filter coefficients h_(k), and an input signal defined by a regressor input signal vector, x_(k), at the time k; (b) calculating an error signal e_(k) at the time k as a difference between the desired output a_(k) at the time k and an estimated signal given by h_(k)^(T)x_(k), such that e_(k)=a_(k)−h_(k)^(T)x_(k); (c) setting k=k+1 and iteratively updating the set of weighting filter coefficients for time k+1 as $h_{k + 1} = h_{k} + \alpha \frac{e_{k}^{3} x_{k}}{\|x_{k}\|^{4}},$ where α is a fixed positive number step-size; and (d) returning to step (b) with the updated set of weighting filter coefficients.
 2. The method for least mean fourth adaptive filtering as recited in claim 1, wherein the desired output a_(k) at the time k is defined as a_(k)=g^(T)x_(k)+b_(k), wherein g is a vector composed of plant parameters and b_(k) represents plant noise.
 3. The method for least mean fourth adaptive filtering as recited in claim 2, further comprising the step of calculating a weight deviation vector v_(k) at the time k as v_(k)≡h_(k)−g.
 4. The method for least mean fourth adaptive filtering as recited in claim 3, further comprising the step of selecting the fixed positive number step-size α such that $0 < \alpha < \frac{2}{v_{k}^{2}}$.
 5. A least mean fourth adaptive filter circuit, comprising a finite impulse response filter circuit having a desired output a_(k) at a time k, an impulse response defined by a set of weighting filter coefficients h_(k), and an input signal defined by a regressor input signal vector at the time k, x_(k), the finite impulse response filter circuit further having: means for calculating an error signal e_(k) at the time k as a difference between the desired output a_(k) at the time k and an estimated signal given by h_(k)^(T)x_(k), such that e_(k)=a_(k)−h_(k)^(T)x_(k); and means for iteratively updating the set of weighting filter coefficients as $h_{k + 1} = h_{k} + \alpha \frac{e_{k}^{3} x_{k}}{\|x_{k}\|^{4}},$ where α is a fixed positive number step-size.
 6. The least mean fourth adaptive filter circuit as recited in claim 5, wherein the desired output a_(k) at the time k is defined as a_(k)=g^(T)x_(k)+b_(k), wherein g is a vector composed of plant parameters and b_(k) represents plant noise.
 7. The least mean fourth adaptive filter circuit as recited in claim 6, wherein the fixed positive number step-size α is selected such that $0 < \alpha < \frac{2}{v_{k}^{2}},$ wherein v_(k) is a weight deviation vector at the time k, given by v_(k)≡h_(k)−g.
 8. A signal processor, comprising a processor and software means for processing a signal, the software means being executable by the processor, the software means including: means for processing the signal through a finite impulse response filter having a desired output a_(k) at a time k, an impulse response defined by a set of weighting filter coefficients h_(k), and an input signal defined by a regressor input signal vector at the time k, x_(k); means for calculating an error signal e_(k) at the time k as a difference between the desired output a_(k) at the time k and an estimated signal given by h_(k)^(T)x_(k), such that e_(k)=a_(k)−h_(k)^(T)x_(k); and means for iteratively updating the set of weighting filter coefficients as $h_{k + 1} = h_{k} + \alpha \frac{e_{k}^{3} x_{k}}{\|x_{k}\|^{4}},$ where α is a fixed positive number step-size.